System and method for the storage, indexing and retrieval of XML documents using relational databases

ABSTRACT

A system and method for assigning attributes to XML document nodes to facilitate their storage in relational databases and the subsequent retrieval and re-construction of pertinent nodes and fragments in original document order is provided. Since these queries are performed using relational database query engines, the speed of their execution is significantly faster than that using more exotic systems such as object-oriented databases. Furthermore, this method is portable across all vendor platforms, and so can be deployed at client sites without additional investments in database software.

PRIORITY CLAIM

[0001] This application claims priority under 35 USC §§ 119 and 120 from U.S. Provisional Patent Application No. 60/169,101 filed Dec. 6, 1999.

BACKGROUND OF THE INVENTION

[0002] This invention relates generally to a system and method for storing documents in one format in a database having a different format and in particular to a system and method for storing and retrieving Extensible Markup Language (XML) documents using a relational database.

[0003] The new Extensible Markup Language (XML) protocol is poised to become the lingua franca of the Internet for capturing and electronically transmitting information. The advantage of XML, as compared to the older hypertext markup language protocol (HTML), is that it contains tags which render semantic significance to the information between the tags (e.g., the text between the tags is the last name of an author). In contrast, HTML tags are used primarily for specifying how the information is to be displayed in a browser (e.g., show the text between the tags in bold Arial font). Additionally, using known Extensible Stylesheets (written in XSL), one may specify not only the format of how different XML elements are to be shown in a browser, but also the order in which they are to be displayed. These features of XML give a user much greater power and flexibility in searching for relevant information since a search may be performed using the tags that contain the semantic information. In addition, XML permits examining the information from different perspectives once it is found by the user.

[0004] To take full advantage of the possibilities that the XML protocol affords, it is desirable to devise an efficient means of storing, indexing and retrieving (via queries) XML documents. Typical RDBMS, ODBMS and flat files are slow and inefficient at storing XML documents. A preferred way of building Document Object Model (DOM) representations of the XML documents and then traversing the resulting trees to locate relevant nodes is only acceptable for small documents since memory becomes a limiting factor when the XML documents approach even moderate sizes. In addition, searches are not optimal since all searches must begin at the root of the document instead of at any node in the document. Moreover, it is not possible to search across a collection of documents (e.g. poems, novels, short stories and plays) for a particular character or the author.

[0005] At the same time, XML documents present unique challenges to storage in relational databases since their semi-structured nature often leads to a proliferation of tables when normalization is carried out. Given that relational database technology has seen great strides over the past couple of decades, it would be desirable and useful to provide a clean way of representing XML documents in relational terms. It is therefore the goal of the present invention to provide a system and method for the storage, indexing and retrieval of XML documents using relational databases.

SUMMARY OF THE INVENTION

[0006] A system and method for storing, indexing and retrieving XML documents in a relational database is provided in accordance with the invention. The method may include identifying and assigning properties and encodings to the nodes of an XML document that will make them amenable to storage and retrieval using relational databases. The method has several advantages. It allows the system to capture and reproduce the structure of not only the whole document, but fragments of each document as well. It also permits a user to traverse the XML tree, figuratively, by means of string manipulation queries instead of following pointers in memory or computing joins between tables, which are computationally more expensive operations. Finally, the properties and encodings that are attached to the nodes are compact and can be effectively indexed, thus enhancing the performance of queries against the database.

[0007] The system in accordance with the invention uses any relational database management system to store the XML documents so that the system and method are not dependent on any particular relational database implementation. The system permits a user to search through the XML documents stored in the relational database from any node element without starting from the root element of the document. This provides optimal efficiency during search and retrieval that cannot be obtained using other methods today. In addition, a document may be constructed from any node and its descendants. The system also permits documents conforming to any XML schema to be stored in an efficient manner. The system can also store any well-formed XML documents that do not conform to any schema or DTD (Document Type Definition). This is an important feature as a large majority of XML documents generated do not conform to a schema or DTD.

[0008] In accordance with the invention, the system may include a converter and a searcher that permit XML documents to be stored in the relational database and retrieved from a relational database using typical SQL queries. In a preferred embodiment, the converter and searcher may be one or more software modules being executed by a central processing unit on a computer system. In accordance with the invention, the method for storing the XML documents may include the steps of generating an XMLName value for each element in the document tree, generating a NamePath value for each node of the document and generating an OrderPath value for each node of the document. Collectively, the values assigned in these steps are called encodings. These encodings result in efficient storage, indexing and searching of XML documents without destroying the underlying hierarchical structure of the documents. The retrieval of the XML documents once they are in the relational database is relatively easy since typical string matching SQL queries may be used.

[0009] Thus, in accordance with the invention, a computer system and method for manipulating an XML document using a relational database is provided. The system comprises a converter that receives an XML document and generates a set of relational database tables based on the hierarchical structure of the XML document, a database for storing the relational database tables, and a searcher for querying the generated relational database tables in the database to locate content originally in the XML document that is now stored in the relational database tables, wherein the located content is returned to the user, which can be another software module, as an XML document or a portion of an XML document as desired. The searcher can also convert queries specified on the XML document or document collections into simple SQL queries to retrieve the content desired by the user.

[0010] In accordance with another aspect of the invention, a computer system for storing an XML document using a relational database is provided wherein the system comprises a converter that receives an XML document and generates relational database tables based on the structure of the XML document. The converter further comprises a software module that generates a unique name attribute for each node in the XML document, a software module that generates a path attribute for a particular node of the XML document wherein the path attribute comprises a list of the name attributes for the one or more nodes from the particular node to a root node of the XML document, a software module that generates an order attribute for the particular node, the order attribute comprising an enumerated order of the particular node from the root node to the particular node, and a software module that generates a NodeValue attribute containing a value of the particular node. Collectively these attributes are called encodings that result in efficient storage, indexing and searching of XML documents without destroying the underlying hierarchical structure of the documents.

[0011] In accordance with yet another aspect of the invention, a data structure that stores a node of interest of an XML document in a relational database is provided. The data structure comprises an XMLName attribute comprising a unique name for the node of interest, a NamePath attribute comprising a list of the XMLName attributes for the one or more nodes from the node of interest to a root node of the XML document, an OrderPath attribute comprising an enumerated order of the node of interest from the root node to the node of interest, and a NodeValue attribute containing a value of the node of interest. Collectively these attributes are called encodings that result in efficient storage, indexing and searching of XML documents without destroying the underlying hierarchical structure of the documents.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIG. 1 is a diagram illustrating a personal computer implementation of an XML document storage and retrieval system in accordance with the invention;

[0013] FIG. 2 is a diagram illustrating more details of the XML document storage and retrieval system in accordance with the invention;

[0014] FIG. 3 is a diagram illustrating an example of a document type definition (DTD) tree for an XML document;

[0015] FIG. 4 is a diagram illustrating an XML document corresponding to the DTD tree shown in FIG. 3;

[0016] FIG. 5 is a flowchart illustrating an example of a method for storing XML documents in a relational database in accordance with the invention; and

[0017] FIG. 6 is a flowchart illustrating a method for retrieving an XML document from a search of a relational database in accordance with the invention.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

[0018] The invention is particularly applicable to a software implemented XML document storage and retrieval system and method and it is in this context that the invention will be described. It will be appreciated, however, that the system and method in accordance with the invention has greater utility since it may be implemented in hardware instead of software.

[0019] FIG. 1 is a block diagram illustrating an embodiment of a software-based XML document storage and retrieval system 20 in accordance with the invention. In this embodiment, the storage and retrieval system 20 may be executed by a computer 22. The computer 22 may be a typical stand-alone personal computer, a computer connected to a network, a client computer connected to a server or any other suitable computer system. For purposes of illustration only, an embodiment using a stand-alone computer 22 will be described herein.

[0020] The computer 22 may include a central processing unit (CPU) 28, a memory 30, a persistent storage device 32, such as a hard disk drive, a tape drive, an optical drive or the like and a storage and retrieval system 34. In a preferred embodiment, the storage and retrieval system may be one or more software applications stored in the persistent storage device 32 of the computer that may be loaded into the memory 30 so that the storage and/or retrieval functionality of the storage and retrieval system may be executed by the CPU 28. The computer 22 may be connected to a remote server or other computer networks that permit the computer 22 to network with and share the stored XML document with other computers or to perform searches on XML documents stored on other computer systems.

[0021] The computer 22 may further include one or more input devices 36, such as a keyboard 38, a mouse 40, a joystick or the like, a display 42 such as a typical cathode ray tube, a flat panel display or the like and one or more output devices (not shown) such as a printer for producing printed output of the search results. The input and output devices permit a user of the computer to interact with the storage and retrieval system so that the user may, for example, enter a query using the input devices and view the results of the query on the display or print the query results.

[0022] As described below in more detail, the storage and retrieval system 34 may include one or more different software modules that provide XML document storage capabilities and XML document retrieval capabilities in accordance with the invention. Now, more details of the storage and retrieval system will be described.

[0023] FIG. 2 is a diagram illustrating more details of the XML document storage and retrieval system 34 in accordance with the invention. The system may include a converter module 50, a searcher module 52 and a relational database 54. Each of the modules may be implemented, in a preferred embodiment, as a software application being executed by a CPU as described above. The relational database 54 may be any type of relational database so that the system 34 in accordance with the invention may be used to store XML documents in any relational database system.

[0024] The converter module 50 accepts XML documents, processes them and outputs relational data about the XML documents, as described below, that is stored in the typical relational database 54. The searcher module 52 generates a user interface for a user, permits the user to enter a text string type relational database query, processes the query by communicating a query to the relational database 54 and sends the results of the query in its original XML form to the user so that the user may view or print the query results. In combination, the two modules shown permit XML documents to be stored in any relational database system and then permit a user to enter a typical text string relational database query in order to retrieve XML documents from the relational database that match the text string query. Each of these modules will be described in more detail below. Now, an example of a Document Type Definition (DTD) of an XML document will be described to better understand the invention. This example DTD will be used to illustrate the storage and retrieval system in accordance with the invention.

[0025] FIG. 3 is a diagram illustrating an example of a Document Type Definition (DTD) tree 60 for an XML document. Although not required to do so, an XML document typically conforms to a DTD which, loosely speaking, is a schema for the data found in the document. However, XML documents are semi-structured in the sense that there are elements specified in the DTD that may be optionally present and some that may be present more than once. This is in contrast to typical relational database tables where each record must have either zero (if it is NULL) or only one value for an attribute.

[0026] XML documents also resemble an object-oriented database in that there are parent-child relationships between elements which are not found between attributes in a relational database. The following example of an XML document should help make these distinctions more clear. An example of the XML DTD syntax may be:

[0027] <!ELEMENT library (book*, periodical*)>

[0028] <!ELEMENT book (title, author+)>

[0029] <!ATTLIST book edition CDATA #REQUIRED>

[0030] <!ELEMENT author (title?, firstname, lastname)>

[0031] In the above example, elements that appear within parentheses are the children of elements before the parentheses. In addition a “*” denotes 0 or more occurrences of the element, a “+” denotes one or more occurrences and a “?” denotes 0 or 1 occurrence. The above example DTD may be represented by the DTD tree shown in FIG. 3. The DTD tree 60 may include a root node 62 (containing the element “library” in this example), one or more intermediate nodes 64 and one or more leaf nodes 66 that do not have any further nodes attached to them. An example of an XML document 70 that conforms to the DTD is shown in FIG. 4. It contains the instances of elements in the DTD tree along with data for each element. The conversion of this example of an XML document into a format that may be stored in a relational database in accordance with the invention will now be described.

[0032] FIG. 5 is a flowchart illustrating an example of a method 80 for storing XML documents in a relational database in accordance with the invention. The method involves computing three properties, each of which is described below, for each XML document node so that the XML document may be stored, in an efficient manner, in a relational database. The encoding scheme set forth below is a preferred encoding embodiment. However, other encoding schemes may also be used. For example, the encoding set forth below (e.g., 1/2/5/6) may be represented as 1 raised to the power 1, 2 raised to the power 2, 3 raised to the power 5 and 4 raised to the power 6 and so on. That way, instead of performing string manipulation, the system would be doing factorization. Based on this other encoding, the factorization approach can generate faster queries and save indexing and database space. Thus, the invention is not limited to any particular encoding and the encodings in accordance with the invention are created based on the structure of the document and then the encodings are used to store, index and search for the content while preserving the hierarchy of the document.

[0033] In a first step 81 of the method, it is determined if an element is ready for processing. If there is an element ready for processing, then the method generates an XMLName property for the particular element. If an element is not ready for processing, but an attribute of the XML document is ready for processing, then the method also generates the XMLName property for the particular attribute. In more detail, the method starts by assigning each element name a unique XMLName property (in this example, the property is alphanumeric). For the example above, we could assign the XMLNames as shown in Table 1 (the XMLName Table).

TABLE 1 (the “XMLName Table”)

Element or Attribute Name    XMLName
library                      1
book                         2
periodical                   3
edition                      4
title                        5
author                       6
firstname                    7
lastname                     8

[0034] Note that “title” gets only one XMLName value even though the element appears twice in the DTD tree as either the title of a book or the title of an author. This allows for more XMLName attributes to be encoded given strings of a specific length.

[0035] Now, in step 84, a NamePath value is automatically determined for each node of the DTD tree. In particular, the NamePath value may be constructed from the XMLNames of each node on the path from the root node to the node of interest. From this analysis, we obtain the following table of NamePath values for the example XML document:

NamePath Table

DTD Node                         NamePath
library                          1
library/book                     1/2
library/periodical               1/3
library/book/edition             1/2/4
library/book/title               1/2/5
library/book/author              1/2/6
library/book/author/title        1/2/6/5
library/book/author/firstname    1/2/6/7
library/book/author/lastname     1/2/6/8

[0036] As shown in the table, each DTD node, such as “library/book/author/lastname” has a corresponding NamePath value, such as “1/2/6/8”. In this manner, using the NamePath values, it is possible to navigate through the XML document using the relational database. In other words, using this table, the path to any node in the DTD tree (and hence the XML document) may be easily determined. This table may also be stored in the relational database.
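The XMLName and NamePath assignments described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `assign_xml_names` and `name_path` are hypothetical, and XMLNames are handed out in first-seen order, which happens to reproduce Table 1 for the example DTD.

```python
# Hypothetical helpers; names and numbering strategy are assumptions for illustration.
def assign_xml_names(names):
    """Give each distinct element or attribute name one XMLName, in first-seen order."""
    table = {}
    for name in names:
        if name not in table:  # "title" occurs twice in the DTD but gets one XMLName
            table[name] = str(len(table) + 1)
    return table


def name_path(dtd_path, xml_names):
    """Translate a slash-separated DTD path into its NamePath."""
    return "/".join(xml_names[step] for step in dtd_path.split("/"))


xml_names = assign_xml_names(
    ["library", "book", "periodical", "edition", "title",
     "author", "title", "firstname", "lastname"]
)
print(name_path("library/book/author/lastname", xml_names))
```

Because the lookup is a plain dictionary, the same table could also serve as the in-memory copy of the XMLName Table.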

[0037] Next, in step 86, the method may automatically generate an OrderPath value for each node in the XML document. In particular, each number in the slash-separated OrderPath (see the table below) denotes the breadth-wise enumerated order of the node on the path from the root to the node of interest. Each document node may also inherit the NamePath of the DTD node of which it is an instance. A full DocNode Table for the example XML document looks like this:

DocNode Table

NodeName     NamePath    OrderPath    NodeValue
library      1           1
book         1/2         1/1
edition      1/2/4       1/1/1        first
title        1/2/5       1/1/2        The XML Revolution
author       1/2/6       1/1/3
title        1/2/6/5     1/1/3/1      Software Engineer

[0038] DocNode Table (continued)

NodeName     NamePath    OrderPath    NodeValue
firstname    1/2/6/7     1/1/3/2      David
lastname     1/2/6/8     1/1/3/3      Hollenbeck
author       1/2/6       1/1/4
title        1/2/6/5     1/1/4/1      Chief Architect
firstname    1/2/6/7     1/1/4/2      Carol
lastname     1/2/6/8     1/1/4/3      Bohr
book         1/2         1/2
edition      1/2/4       1/2/1        second
title        1/2/5       1/2/2        Java Classes for XML
author       1/2/6       1/2/3
firstname    1/2/6/7     1/2/3/1      Carol
lastname     1/2/6/8     1/2/3/2      Hollenbeck
author       1/2/6       1/2/4
title        1/2/6/5     1/2/4/1      XML Guru
firstname    1/2/6/7     1/2/4/2      David
lastname     1/2/6/8     1/2/4/3      Bohr

[0039] As shown in the Table that may be stored in a relational database, each document node may include a NodeName value (the name of the element), a NamePath value (see above), an OrderPath value (automatically generated during this step), and a NodeValue value (containing the actual data in that particular node).

[0040] In step 88, the method determines if there are any more nodes to process and loops back to step 81 if there are more nodes. If all of the nodes have been processed, then the DocNode Table may be saved in the relational database. In this manner, an XML document is automatically processed in order to generate a DocNode Table that may be stored in any relational database. Once the DocNode table is generated by the system, it may be searched as will now be described in more detail.
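As a rough illustration of the conversion loop, the following Python sketch walks an XML document and emits one DocNode row per element or attribute. Several things here are assumptions rather than part of the description: the function name `doc_nodes`, the use of the standard `xml.etree.ElementTree` parser, the rule that attributes are numbered before child elements (inferred from the position of “edition” in the DocNode Table), and the sample document itself, which is reconstructed from the first rows of the table because FIG. 4 is not reproduced in the text.

```python
import xml.etree.ElementTree as ET

# XMLName Table from the text, held as a dictionary.
XML_NAMES = {"library": "1", "book": "2", "periodical": "3", "edition": "4",
             "title": "5", "author": "6", "firstname": "7", "lastname": "8"}


def doc_nodes(xml_text, xml_names=XML_NAMES):
    """Return (NodeName, NamePath, OrderPath, NodeValue) rows in document order."""
    rows = []

    def visit(elem, name_path, order_path):
        text = (elem.text or "").strip()
        rows.append((elem.tag, name_path, order_path, text or None))
        position = 0
        # Assumption: attributes are enumerated before child elements.
        for attr, value in elem.attrib.items():
            position += 1
            rows.append((attr, f"{name_path}/{xml_names[attr]}",
                         f"{order_path}/{position}", value))
        for child in elem:
            position += 1
            visit(child, f"{name_path}/{xml_names[child.tag]}",
                  f"{order_path}/{position}")

    root = ET.fromstring(xml_text)
    visit(root, xml_names[root.tag], "1")
    return rows


# Assumed sample document, reconstructed from the first DocNode Table rows.
sample = """<library><book edition="first"><title>The XML Revolution</title>
<author><title>Software Engineer</title><firstname>David</firstname>
<lastname>Hollenbeck</lastname></author></book></library>"""

for row in doc_nodes(sample):
    print(row)
```

Each row carries its full NamePath and OrderPath, so the rows can be inserted into a single table without any foreign keys between parent and child nodes.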

[0041] FIG. 6 is a flowchart illustrating a method 100 for retrieving an XML document from a search of a relational database in accordance with the invention. In step 102, the user, or the system using user input, may generate a relational database query. In step 104, the system may query the relational database and in step 106, the query results are output to the user. In accordance with the invention, the system may convert the query results back into references to portions of the XML document so that the user may review the portions of the XML document retrieved during the search in step 108. Now, several examples of retrieving XML documents based on a relational database search will be provided. In particular, a few examples will be shown of how the system may use the NamePath and OrderPath values to select nodes with desired attributes from the XML document repository and also may construct fragments of the original XML documents containing these selected nodes. In all the sample queries below, we assume that we know the context (i.e., the position within the DTD tree) of the nodes we are interested in.

[0042] In a first example, a user wants to query the XML document repository to return the titles of all books that have an author with the title of “Chief Architect”. Since we know the context of title (i.e., library/book/author/title), we can consult the XMLName Table to obtain the relevant XMLNames and construct the NamePath of title which is “1/2/6/5” in this example. Then, the system may issue the first query that is:

[0043] “Select OrderPath from DocNodeTable where NamePath='1/2/6/5' and NodeValue='Chief Architect'”

[0044] This query returns an OrderPath of “1/1/4/1” as the result. Since we also know that the element “book” is a grand-parent of element “title”, we can deduce that its OrderPath is 1/1. Finally we construct the NamePath of the element “book title” as “1/2/5” and execute the second query that is:

[0045] “Select NodeValue from DocNodeTable where NamePath='1/2/5' and OrderPath like '1/1/%'”.

[0046] This second query returns the value “The XML Revolution” as the result. This result accomplishes the user goal of returning all books whose author's title is “Chief Architect”. In this manner, the XML document repository is queried using typical relational database queries.
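The two queries of this first example can be tried end to end with an in-memory SQLite database. The sketch below is an illustration only: SQLite stands in for "any relational database", the helper `grandparent` is a hypothetical name, and the rows are copied from the DocNode Table.

```python
import sqlite3

# (NodeName, NamePath, OrderPath, NodeValue), copied from the DocNode Table.
ROWS = [
    ("library", "1", "1", None),
    ("book", "1/2", "1/1", None),
    ("edition", "1/2/4", "1/1/1", "first"),
    ("title", "1/2/5", "1/1/2", "The XML Revolution"),
    ("author", "1/2/6", "1/1/3", None),
    ("title", "1/2/6/5", "1/1/3/1", "Software Engineer"),
    ("firstname", "1/2/6/7", "1/1/3/2", "David"),
    ("lastname", "1/2/6/8", "1/1/3/3", "Hollenbeck"),
    ("author", "1/2/6", "1/1/4", None),
    ("title", "1/2/6/5", "1/1/4/1", "Chief Architect"),
    ("firstname", "1/2/6/7", "1/1/4/2", "Carol"),
    ("lastname", "1/2/6/8", "1/1/4/3", "Bohr"),
    ("book", "1/2", "1/2", None),
    ("edition", "1/2/4", "1/2/1", "second"),
    ("title", "1/2/5", "1/2/2", "Java Classes for XML"),
    ("author", "1/2/6", "1/2/3", None),
    ("firstname", "1/2/6/7", "1/2/3/1", "Carol"),
    ("lastname", "1/2/6/8", "1/2/3/2", "Hollenbeck"),
    ("author", "1/2/6", "1/2/4", None),
    ("title", "1/2/6/5", "1/2/4/1", "XML Guru"),
    ("firstname", "1/2/6/7", "1/2/4/2", "David"),
    ("lastname", "1/2/6/8", "1/2/4/3", "Bohr"),
]


def grandparent(order_path):
    """Drop the last two steps of an OrderPath (title -> author -> book)."""
    return "/".join(order_path.split("/")[:-2])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DocNodeTable "
             "(NodeName TEXT, NamePath TEXT, OrderPath TEXT, NodeValue TEXT)")
conn.executemany("INSERT INTO DocNodeTable VALUES (?, ?, ?, ?)", ROWS)

# First query: where does an author title of 'Chief Architect' occur?
hits = conn.execute(
    "SELECT OrderPath FROM DocNodeTable "
    "WHERE NamePath = '1/2/6/5' AND NodeValue = 'Chief Architect'"
).fetchall()

# Second query: fetch the book title under each matching book node.
titles = []
for (order_path,) in hits:
    book = grandparent(order_path)
    titles += [value for (value,) in conn.execute(
        "SELECT NodeValue FROM DocNodeTable "
        "WHERE NamePath = '1/2/5' AND OrderPath LIKE ?", (book + "/%",))]
print(titles)
```

The step between the two queries is pure string manipulation on the OrderPath, which is the point the description makes about avoiding joins.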

[0047] In this second example, the user wants to search for the titles of all books that have an author by the name of Carol Hollenbeck. To accomplish this, the system may generate a first query to select the OrderPaths of all firstname nodes with the value Carol:

[0048] “Select OrderPath from DocNodeTable where NamePath='1/2/6/7' and NodeValue='Carol'”.

[0049] This query returns “1/1/4/2” and “1/2/3/1” as the result set. Next, a second query is generated to select the OrderPaths of all lastname nodes with the value Hollenbeck:

[0050] “Select OrderPath from DocNodeTable where NamePath='1/2/6/8' and NodeValue='Hollenbeck'”

[0051] This query returns “1/1/3/3” and “1/2/3/2” as the result set. Since we know firstname and lastname nodes of the same person belong to the same parent author node, we can deduce from the result sets that only the nodes with OrderPaths “1/2/3/1” and “1/2/3/2” are of interest to us. Thus, we want the title of the book with OrderPath 1/2, which we can retrieve with the following query:

[0052] “Select NodeValue from DocNodeTable where NamePath='1/2/5' and OrderPath like '1/2/%'”

[0053] This query returns “Java Classes for XML” as the result which is the proper result.
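The deduction step between the two result sets and the final query is string manipulation on OrderPaths, and can be sketched as follows. The helper name `parent` is hypothetical; the two result sets are the ones quoted in the text.

```python
def parent(order_path):
    """OrderPath of the parent node: drop the last slash-separated step."""
    return order_path.rsplit("/", 1)[0]


firstname_hits = ["1/1/4/2", "1/2/3/1"]  # result set of the 'Carol' query
lastname_hits = ["1/1/3/3", "1/2/3/2"]   # result set of the 'Hollenbeck' query

# Keep only author nodes that own both a matching firstname and a matching lastname.
authors = sorted({parent(p) for p in firstname_hits}
                 & {parent(p) for p in lastname_hits})
books = [parent(a) for a in authors]
like_patterns = [b + "/%" for b in books]  # patterns for the final title query
print(authors, books, like_patterns)
```

Only the author node “1/2/3” survives the intersection, which leads to the book at “1/2” and the LIKE pattern used in the final query.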

[0054] In a third example, the user wants all the information pertaining to the authors of “The XML Revolution” to be returned and presented in the original document order. Thus, first, the OrderPath of the relevant title node is determined by the following query:

[0055] “Select OrderPath from DocNodeTable where NamePath='1/2/5' and NodeValue='The XML Revolution'”

[0056] This query returns “1/1/2” as the result. Thus, as a result of the first query, we know that the OrderPath of the relevant book node is “1/1”. Since the nodes for all author information are descendants of the author node (that has NamePath “1/2/6”), which in turn is a child of the “book” node, we can execute the following query to obtain the required result:

[0057] “Select NodeValue from DocNodeTable where NamePath like '1/2/6/%' and OrderPath like '1/1/%' Order by OrderPath”

[0058] This query returns “Software Engineer, David, Hollenbeck, Chief Architect, Carol, Bohr” in the original document order as the result set.
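A small SQLite sketch of this third example shows how ordering by OrderPath restores document order even when rows are stored in arbitrary order. The rows are a subset of the DocNode Table, deliberately inserted out of order; SQLite and the variable names are assumptions made for the illustration.

```python
import sqlite3

# Subset of the DocNode Table, inserted out of document order on purpose.
ROWS = [
    ("title", "1/2/6/5", "1/2/4/1", "XML Guru"),  # belongs to the other book
    ("lastname", "1/2/6/8", "1/1/4/3", "Bohr"),
    ("firstname", "1/2/6/7", "1/1/4/2", "Carol"),
    ("title", "1/2/6/5", "1/1/4/1", "Chief Architect"),
    ("lastname", "1/2/6/8", "1/1/3/3", "Hollenbeck"),
    ("firstname", "1/2/6/7", "1/1/3/2", "David"),
    ("title", "1/2/6/5", "1/1/3/1", "Software Engineer"),
    ("title", "1/2/5", "1/1/2", "The XML Revolution"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DocNodeTable "
             "(NodeName TEXT, NamePath TEXT, OrderPath TEXT, NodeValue TEXT)")
conn.executemany("INSERT INTO DocNodeTable VALUES (?, ?, ?, ?)", ROWS)

# First query: locate the title node, then derive the book node's OrderPath.
(title_path,) = conn.execute(
    "SELECT OrderPath FROM DocNodeTable "
    "WHERE NamePath = '1/2/5' AND NodeValue = 'The XML Revolution'"
).fetchone()
book_path = title_path.rsplit("/", 1)[0]  # "1/1/2" -> "1/1"

# Second query: all author descendants of that book, in document order.
author_info = [value for (value,) in conn.execute(
    "SELECT NodeValue FROM DocNodeTable "
    "WHERE NamePath LIKE '1/2/6/%' AND OrderPath LIKE ? ORDER BY OrderPath",
    (book_path + "/%",))]
print(author_info)
```

The plain text sort is sufficient here because every OrderPath step is a single digit; steps with differing digit counts would need additional handling.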

[0059] Now, several enhancements to the system and method described above will be provided. In accordance with another aspect of the invention, the XMLName Table may be cached in memory. In particular, to facilitate construction of the NamePath values, we can store the contents of XMLName Table in a hash table which we keep resident in memory. This prevents the execution of multiple queries against the database to obtain all the necessary XMLName values. In accordance with yet another aspect of the invention, the XMLName values may be divided into NameSpaces. In particular, as the number of XMLName values increases, it may become necessary to divide the values into various namespaces to keep the lengths of the names short. XMLName values from namespaces relevant for working with a particular document can then be brought into the cache when necessary without having to bring the entire XMLNameTable into memory.

[0060] In accordance with yet another aspect of the invention, the system may use base-64 encoding. In particular, to reduce the amount of storage required for the XMLName, NamePath, and OrderPath tables in the relational database, we could consider using a Base-64 encoding scheme instead of alphanumeric strings. In accordance with the invention, it is also possible to add a DigitPath attribute as an adjunct attribute to OrderPath so that the system can ensure proper sorting of nodes while obviating the need for place-holding characters as the number of characters increases. For example, to sort the paths “1/10/2” and “1/2/3” properly, the system would have needed to encode the second as “11-2/3”. However, if we added “1/2/1” and “1/1/1” as DigitPaths and ordered the results by these before OrderPaths, then we would be able to do without the place-holding dashes.
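One reading of the DigitPath idea is that each OrderPath step is replaced by its digit count, and results are sorted on the DigitPath first and the OrderPath second. The sketch below follows that reading, which matches the two DigitPaths quoted in the text; the function name `digit_path` is hypothetical. It compares paths of equal depth only, since the text does not state how paths of different depths are to be handled.

```python
def digit_path(order_path):
    """Replace each step of an OrderPath by its number of digits."""
    return "/".join(str(len(step)) for step in order_path.split("/"))


paths = ["1/10/2", "1/2/3"]

plain = sorted(paths)  # naive string sort
by_digit_path = sorted(paths, key=lambda p: (digit_path(p), p))

print([digit_path(p) for p in paths])
print(plain)
print(by_digit_path)
```

In SQL terms the second ordering would correspond to sorting on a DigitPath column before the OrderPath column.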

[0061] In accordance with the invention, a ReverseNamePath attribute may be automatically generated to further improve the speed of queries. In particular, since it is possible to have an XML document that is an instance of a DTD sub-tree, we may need to evaluate an expression such as:

[0062] “Select NodeValue from DocNodeTable where NamePath like '%/2/3'”

[0063] Since indexes built on NamePath generally do not help in the execution of such queries, we can improve performance by having a ReverseNamePath attribute constructed by reversing the order of the XMLNames in the path expression. Thus, in accordance with the invention, the above query would now read:

[0064] “Select NodeValue from DocNodeTable where ReverseNamePath like '3/2/1/%'”
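The effect of storing a reversed path can be illustrated with a short Python sketch: a "path ends with" test on NamePath becomes a "path starts with" test on ReverseNamePath, which is the form an ordinary index can serve. The function name `reverse_name_path` and the chosen suffix (an author's title, XMLNames 6 then 5) are assumptions for the illustration.

```python
def reverse_name_path(name_path):
    """Reverse the order of the XMLNames in a slash-separated NamePath."""
    return "/".join(reversed(name_path.split("/")))


name_paths = ["1/2/5", "1/2/6/5", "1/2/6/7", "1/2/6/8"]
reversed_paths = {p: reverse_name_path(p) for p in name_paths}

# NamePaths ending in "6/5" (an author's title) share the reversed prefix "5/6/".
matches = [p for p, r in reversed_paths.items() if r.startswith("5/6/")]
print(reversed_paths)
print(matches)
```

The prefix test corresponds to a LIKE pattern with the wildcard at the end only.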

[0065] In accordance with the invention, the system may include a transformation engine that converts XPath expressions into equivalent SQL statements involving NamePath and OrderPath attributes so that the converted queries would then be executed against the repository.
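The description does not specify how such a transformation engine works. As a very small sketch of the idea, the hypothetical function `xpath_to_sql` below handles only absolute XPath expressions made of plain child steps (no predicates, wildcards or `//`), and returns a parameterized SQL statement against the DocNodeTable used in the examples above.

```python
# XMLName Table from the text, held as a dictionary.
XML_NAMES = {"library": "1", "book": "2", "periodical": "3", "edition": "4",
             "title": "5", "author": "6", "firstname": "7", "lastname": "8"}

SQL = "SELECT NodeValue FROM DocNodeTable WHERE NamePath = ? ORDER BY OrderPath"


def xpath_to_sql(xpath, xml_names=XML_NAMES):
    """Translate an absolute XPath of plain child steps, e.g. /library/book/title.

    Returns (sql, parameters). Anything beyond plain child steps is rejected;
    this is a deliberately narrow illustration, not a full XPath translator.
    """
    if not xpath.startswith("/") or xpath.startswith("//"):
        raise ValueError("only absolute child-step paths are handled in this sketch")
    steps = xpath.strip("/").split("/")
    unknown = [s for s in steps if s not in xml_names]
    if unknown:
        raise ValueError(f"unsupported or unknown steps: {unknown}")
    name_path = "/".join(xml_names[s] for s in steps)
    return SQL, (name_path,)


print(xpath_to_sql("/library/book/author/lastname"))
```

The returned pair can be passed directly to a database cursor's `execute` method.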

[0066] In summary, a system and method for assigning attributes to XML document nodes to facilitate their storage and indexing in relational databases and the subsequent retrieval and re-construction of pertinent nodes and fragments in original document order is provided. Since these queries are performed using relational database query engines, the speed of their execution is significantly faster than that using more exotic systems such as object-oriented databases. Furthermore, this method is portable across all vendor platforms, and so can be deployed at client sites without additional investments in database software.

[0067] In accordance with the invention, the hierarchical relationships of XML documents are encoded so that the XML documents may be mapped to a set of relational tables. Once the mapping and encoding is completed, then searching and querying of the XML documents may be done by mapping any XML query language (which is well known) to SQL (also well known) automatically.

[0068] While the foregoing has been with reference to a particular embodiment of the invention, it will be appreciated by those skilled in the art that changes in this embodiment may be made without departing from the principles and spirit of the invention as set forth in the appended claims.

1. A computer system for manipulating an XML document using a relationaldatabase, comprising: a converter that receives an XML document andgenerates a pre-determined set of relational database tables based onthe XML document; a database for storing the relational database table;and a searcher for querying the generated relational database table inthe database to locate content originally in the XML document that isnow stored in the relational database table wherein the located contentis returned to the user as a portion of an XML document.
 2. The systemof claim 1, wherein the converter further comprises a software modulethat generates a unique name attribute for each node in the XMLdocument.
 3. The system of claim 2, wherein the converter furthercomprises a software module that generates a path attribute for aparticular node of the XML document wherein the path attribute comprisesa list of the name attributes for the one or more nodes from theparticular node to a root node of the XML document.
 4. The system ofclaim 3, wherein the converter further comprises a software module thatgenerates an order attribute for the particular node, the orderattribute comprising an enumerated order of the particular node from theroot node to the particular node.
 5. The system of claim 4, wherein theconverter further comprises a software module that generates a NodeValueattribute containing a value of the particular node.
 6. The system ofclaim 5, wherein the searcher further comprises a query generator thatgenerates a query into the database to find a piece of information inthe database corresponding to information in a node of the XML documentand a converter that converts the results of the query into portions ofan XML document that are displayed to the user.
7. The system of claim 2, wherein the name attribute for each node in the XML document is stored in a hash table so that the name attributes are retrieved from the hash table instead of the database.
8. The system of claim 2, wherein the name attributes of the nodes of the XML document are divided into one or more categories so that related name attributes are grouped together.
9. The system of claim 1, wherein the name attributes are encoded using base-64 encoding.
10. The system of claim 3, wherein the converter further comprises a software module that generates a reverse path comprising the list of name attributes from the path attribute in reverse order.
11. The system of claim 1, wherein the converter further comprises a transform engine that converts Xpath expressions in the XML document into SQL queries.
12. A computer system for storing an XML document using a relational database, comprising: a converter that receives an XML document and generates a relational database table based on the XML document; the converter further comprising a software module that generates a unique name attribute for each node in the XML document, a software module that generates a path attribute for a particular node of the XML document wherein the path attribute comprises a list of the name attributes for the one or more nodes from the particular node to a root node of the XML document, a software module that generates an order attribute for the particular node, the order attribute comprising an enumerated order of the particular node from the root node to the particular node, and a software module that generates a NodeValue attribute containing a value of the particular node.
13. A method for manipulating an XML document using a relational database, comprising: generating a relational database table based on an XML document wherein the information about each node of the XML document is stored in a row of the table; storing the relational database table in a database; and querying the generated relational database table in the database to locate content originally in the XML document that is now stored in the relational database table wherein the located content is returned to the user as a portion of an XML document.
14. The method of claim 13, wherein generating the table further comprises generating a unique name attribute for each node in the XML document.
15. The method of claim 14, wherein generating the table further comprises generating a path attribute for a particular node of the XML document wherein the path attribute comprises a list of the name attributes for the one or more nodes from the particular node to a root node of the XML document.
16. The method of claim 15, wherein generating the table further comprises generating an order attribute for the particular node, the order attribute comprising an enumerated order of the particular node from the root node to the particular node.
17. The method of claim 16, wherein generating the table further comprises generating a NodeValue attribute containing a value of the particular node.
18. The method of claim 17, wherein querying the database further comprises generating a query into the database to find a piece of information in the database corresponding to information in a node of the XML document and converting the results of the query into portions of an XML document that are displayed to the user.
19. The method of claim 14 further comprising retrieving the name attribute for each node in the XML document from a hash table so that the name attributes are retrieved from the hash table instead of the database.
20. The method of claim 14, wherein the name attributes of the nodes of the XML document are divided into one or more categories so that related name attributes are grouped together.
21. The method of claim 13, wherein the name attributes are encoded using base-64 encoding.
22. The method of claim 15, wherein generating the table further comprises generating a reverse path comprising the list of name attributes from the path attribute in reverse order.
23. The method of claim 13, wherein generating the table further comprises converting Xpath expressions in the XML document into SQL queries.
24. A data structure that stores a node of interest of an XML document in a relational database, the data structure comprising: an XMLName attribute comprising a unique name for the node of interest; a NamePath attribute comprising a list of the XMLName attributes for the one or more nodes from the node of interest to a root node of the XML document; an OrderPath attribute comprising an enumerated order of the node of interest from the root node to the node of interest; and a NodeValue attribute containing a value of the node of interest.
25. The data structure of claim 24, wherein the data structure comprises a table in a relational database and each attribute comprises a column in the table in the relational database.
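
By way of illustration only, the four-attribute data structure recited above (XMLName, NamePath, OrderPath, NodeValue) might be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of the invention: the table name `nodes`, the helper functions `node_rows`, `store` and `find`, the use of the element tag as the XMLName, the `/` and `.` separators, and the zero-padded OrderPath components (chosen so that a plain string sort reproduces document order) are all hypothetical choices made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def node_rows(root):
    """Yield (XMLName, NamePath, OrderPath, NodeValue) for every element."""
    def walk(elem, names, orders):
        names = names + [elem.tag]
        # Assumption: NamePath lists names from this node up to the root.
        name_path = "/".join(reversed(names))
        # Assumption: zero-padded components so string order = document order.
        order_path = ".".join(f"{n:04d}" for n in orders)
        value = (elem.text or "").strip()
        yield (elem.tag, name_path, order_path, value)
        for position, child in enumerate(elem, start=1):
            yield from walk(child, names, orders + [position])

    yield from walk(root, [], [1])


def store(conn, xml_text):
    """Convert an XML string into rows of a single (hypothetical) table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "XMLName TEXT, NamePath TEXT, OrderPath TEXT, NodeValue TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?, ?)",
        node_rows(ET.fromstring(xml_text)),
    )


def find(conn, name_path):
    """Return matching nodes in original document order."""
    cur = conn.execute(
        "SELECT OrderPath, NodeValue FROM nodes "
        "WHERE NamePath = ? ORDER BY OrderPath",
        (name_path,),
    )
    return cur.fetchall()


doc = (
    "<book>"
    "<author><last>Smith</last></author>"
    "<author><last>Jones</last></author>"
    "</book>"
)
conn = sqlite3.connect(":memory:")
store(conn, doc)
hits = find(conn, "last/author/book")
for order_path, value in hits:
    print(order_path, value)
```

In this sketch each node occupies one row and each attribute one column, and a query on NamePath sorted by OrderPath returns the matching values in document order. Element attributes, mixed content, and the hash-table and base-64 refinements recited in the dependent claims are omitted for brevity.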